Here I show three diagnostics: Convergence and efficiency, Traceplots, and Pairs plots.

Convergence and efficiency

From the Stan documentation on convergence and efficiency diagnostics for Markov Chains:

The Rhat function produces R-hat convergence diagnostic, which compares the between- and within-chain estimates for model parameters and other univariate quantities of interest. If chains have not mixed well (ie, the between- and within-chain estimates don’t agree), R-hat is larger than 1. We recommend running at least four chains by default and only using the sample if R-hat is less than 1.05. Stan reports R-hat which is the maximum of rank normalized split-R-hat and rank normalized folded-split-R-hat, which works for thick tailed distributions and is sensitive also to differences in scale.

The ess_bulk function produces an estimated Bulk Effective Sample Size (bulk-ESS) using rank normalized draws. Bulk-ESS is useful measure for sampling efficiency in the bulk of the distribution (related e.g. to efficiency of mean and median estimates), and is well defined even if the chains do not have finite mean or variance.

The ess_tail function produces an estimated Tail Effective Sample Size (tail-ESS) by computing the minimum of effective sample sizes for 5% and 95% quantiles. Tail-ESS is useful measure for sampling efficiency in the tails of the distribution (related e.g. to efficiency of variance and tail quantile estimates).

Both bulk-ESS and tail-ESS should be at least 100 (approximately) per Markov Chain in order to be reliable and indicate that estimates of respective posterior quantiles are reliable.

In the plots below, I will show these three diagnostics for the models run for each of our six populations.


R-hat statistics. R-hat values should be less than 1.05, and all parameters here have an R-hat of less than 1.01.
R-hat statistics. R-hat values should be less than 1.05, and all parameters here have an R-hat of less than 1.01.


Bulk Effective Sample Size (bulk-ESS). bulk-ESS should be at least 100 in order for posteriors to be reliable, but a bulk-ESS greater than the number of draws (which in this case is 4 chains x 1000 iter) indicates anticorrelation between draws, which reduces the reliability of the variance estimate. bulk-ESS indicates that all parameters have a sufficient sample size, but a handful of parameters exhibit a slight degree of anticorrelation (bulk-ESS > 4,000).
Bulk Effective Sample Size (bulk-ESS). bulk-ESS should be at least 100 in order for posteriors to be reliable, but a bulk-ESS greater than the number of draws (which in this case is 4 chains x 1000 iter) indicates anticorrelation between draws, which reduces the reliability of the variance estimate. bulk-ESS indicates that all parameters have a sufficient sample size, but a handful of parameters exhibit a slight degree of anticorrelation (bulk-ESS > 4,000).


Tail Effective Sample Size (tail-ESS). tail-ESS should be at least 100 in order for posteriors to be reliable, but a tail-ESS greater than the number of draws (which in this case is 4 chains x 1000 iter) indicates anticorrelation between draws, which reduces the reliability of the variance estimate. tail-ESS indicates that all parameters have a sufficient sample size, but a handful of parameters exhibit a slight degree of anticorrelation (tail-ESS > 4,000).
Tail Effective Sample Size (tail-ESS). tail-ESS should be at least 100 in order for posteriors to be reliable, but a tail-ESS greater than the number of draws (which in this case is 4 chains x 1000 iter) indicates anticorrelation between draws, which reduces the reliability of the variance estimate. tail-ESS indicates that all parameters have a sufficient sample size, but a handful of parameters exhibit a slight degree of anticorrelation (tail-ESS > 4,000).


Traceplots

Given that each model is estimating hundreds of parameters, traceplots for all parameters will not be shown here. Instead, traceplots for a few parameters from each model will be shown here.

Upper Columbia Wild traceplots.
Upper Columbia Wild traceplots.


Upper Columbia Hatchery traceplots.
Upper Columbia Hatchery traceplots.


Middle Columbia Wild traceplots.
Middle Columbia Wild traceplots.


Middle Columbia Hatchery traceplots.
Middle Columbia Hatchery traceplots.


Snake River Wild traceplots.
Snake River Wild traceplots.


Snake River Hatchery traceplots.
Snake River Hatchery traceplots.


Pairs plots

Pairs plots show univariate histograms and bivariate scatter plots for selected parameters, and are especially useful for identifying collinearity between variables (which manifests as narrow bivariate plots) as well as the presence of multiplicative non-identifiabilities (banana-like shapes).

For these pairs plots, I am showing all of the different parameters that exist for a single movement (e.g., the intercept, origin, and temperature parameters that govern the probability of a single movement, such as ascending a specific dam). Again, there are too many movements across the six models to show all of these plots, but I will show examples of potential issues related to collinearity in our predictors. A description of collinearity of predictors in regressions can be found in the Stan User Guide.

After making the change such that each fish receives only either a DPS-specific or an origin-specific intercept term (which fixed nonsensical collinearity issues between origin and intercept parameters), there are still two common cases of collinearity that are important to be aware of for interpreting parameter estimates:

  1. Collinearity between temperature and intercept parameters, for ascending mainstem dams: Temperature and intercept collinearity
  2. Collinearity between intercept and spill parameters, for en-route fallback:Intercept and spill collinearity


Temperature and intercept collinearity

Upper Columbia Wild Steelhead, ascending McNary Dam (moving from state 2, BON to MCN, to state 3, MCN to PRA or ICH).
Upper Columbia Wild Steelhead, ascending McNary Dam (moving from state 2, BON to MCN, to state 3, MCN to PRA or ICH).



Snake River Hatchery Steelhead, ascending McNary Dam (moving from state 2, BON to MCN, to state 3, MCN to PRA or ICH).
Snake River Hatchery Steelhead, ascending McNary Dam (moving from state 2, BON to MCN, to state 3, MCN to PRA or ICH).



For ascents of dams, particularly those that are downstream of natal tributaries, the intercept parameter (b0) and the temperature parameter (btemp1) that estimates the effect of river temperature during summer/fall (Jun 1 - Dec 31) are highly negatively correlated. Shown above are two pairs plots for the parameters that govern the probability of ascending McNary Dam for Upper Columbia Wild Steelhead and Snake River Hatchery Steelhead. As seen in the plots, b0 and btemp1 are strongly negatively correlated, while b0 and btemp0 (the effect of river temperature during winter/spring, Jan 1 - May 1) are also negatively correlated.

My understanding of why this is happening is that this upstream movement is the dominant movement out of this state (as reaching natal tributaries for all fish from these DPSs requires ascending McNary Dam), and this upstream movement is also happening when temperatures are highest (in alignment with migration in the late summer/early fall). The negative collinearity arises from the pattern that when the intercept (b0) draw is low, the temperature (btemp1) draw must be high to explain the high probability of that movement, and vice versa.

Interestingly, the parameter estimates for the effect of temperature on ascending McNary Dam (a necessary movement for these fish to reach their natal tributaries) indicate that warmer temperatures actually decrease the probability of making this ascent, therefore making other movements (such as movements into tribuatries like the Deschutes, or into the loss state) more likely in warmer conditions.

Of course, temperature and day of year are strongly linked, as seen by these plots showing counts of ascents of McNary Dam by day of year and temperature, for Upper Columbia and Snake River Steelhead:


Upper Columbia River Steelhead ascents at McNary Dam by day of year and temperature.
Upper Columbia River Steelhead ascents at McNary Dam by day of year and temperature.


Snake River Steelhead ascents at McNary Dam by day of year and temperature.
Snake River Steelhead ascents at McNary Dam by day of year and temperature.



Intercept and spill collinearity

Upper Columbia Wild Steelhead, falling back over Bonneville Dam (moving from state 2, which is BON to MCN, to state 1, which is mouth to BON).
Upper Columbia Wild Steelhead, falling back over Bonneville Dam (moving from state 2, which is BON to MCN, to state 1, which is mouth to BON).


For fallback movements, especially for fallback over BON, the intercept (b0_matrix_2_1) and spill window (bspillwindow_matrix_2_1) parameters are negatively correlated, with a similar pattern as that seen for the relationship between the btemp and b0 parameters shown above.

The same pattern is observed for spill as is observed for temperature: fallback events occur primarily at high spill levels, but also with distinct seasonality (these en-route fallback events are happening primarily when fish have just ascended Bonneville Dam in late summer). The negative collinearity arises from the pattern that when the intercept (b0) draw is low, the spill (bspillwindow) draw must be high to explain the observed movements, and vice versa.

Upper Columbia River Steelhead fallback events at Bonneville Dam by day of year and spill volume.
Upper Columbia River Steelhead fallback events at Bonneville Dam by day of year and spill volume.



Pairs plots conclusions

The correlation between temperature + intercept and spill + intercept don’t really concern me too much - the posteriors seem to be properly behaved. They just need to be interpreted as representing seasonality as much as they are representing the covariates themselves.

Big picture: The correlation between parameters emphasizes that parameter estimates cannot be interpreted in isolation. Instead, posterior predictive estimates of movements are necessary to interpret the meaning of parameters.